In the add-on to the session, we calculated the distances to the closest hospitals located within North Rhine-Westphalia (NRW). Still, we did not show how to subset the original file, which contains all hospitals in Germany.
Subset the data file yourself by relying on the spatial information of the file hospital_points.csv and a polygon of NRW only. How many hospitals are located within the borders of NRW?
You need two data sets for that: the point layer hospital_points.csv in the ./data folder and polygons of NRW. For the latter, you can again use the osmdata syntax.
The default of sf::st_join() is a ‘left join’: it returns a data object with all hospitals, plus the matching information from the polygon layer for those located within NRW. Set left = FALSE to perform an ‘inner join’ instead and keep only the observations that lie within the predefined area (sf::st_join(x, y, join = "", left = FALSE)).
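The difference between the two join types can be tried out on a tiny example first. The following is a minimal sketch on invented toy data (three points and one square polygon, none of which come from the course files); it uses sf::st_within as the join predicate, which is one of several possible choices.

```r
library(sf)

# Toy data (hypothetical): three points, two of them inside the unit square
pts <- sf::st_as_sf(
  data.frame(id = 1:3, x = c(0.5, 1.5, 0.2), y = c(0.5, 0.5, 0.8)),
  coords = c("x", "y"),
  crs = 3035
)

# Toy polygon layer (hypothetical): a single square with one attribute
square <- sf::st_sf(
  region = "toy_region",
  geometry = sf::st_sfc(
    sf::st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
    crs = 3035
  )
)

# default (left = TRUE): every point is kept, region is NA outside the square
left_joined <- sf::st_join(pts, square, join = sf::st_within)

# left = FALSE: only the points that fall inside the square are kept
inner_joined <- sf::st_join(pts, square, join = sf::st_within, left = FALSE)

nrow(left_joined)
nrow(inner_joined)
```

With the left join, the point outside the square stays in the data with a missing region value; with the inner join, it is dropped.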
# load hospitals
hospitals <-
  read.csv(
    "./data/hospital_points.csv",
    header = TRUE,
    fill = TRUE,
    sep = ","
  ) %>%
  sf::st_as_sf(coords = c("X", "Y"), crs = 3035)
# use the OSM function to get the NRW boundary
nrw <-
  osmdata::getbb(
    "Nordrhein-Westfalen",
    format_out = "sf_polygon"
  ) %>%
  .$multipolygon %>%
  sf::st_transform(3035)
# spatial join
nrw_hospitals <-
  hospitals %>%
  sf::st_join(
    # polygon layer nrw
    nrw,
    # choose intersects or within
    join = sf::st_intersects,
    # left = FALSE keeps only
    # the hospitals that
    # could be joined
    left = FALSE
  )
nrow(nrw_hospitals)
## [1] 344
# 344 hospitals in NRW
Did the operationalization of health care provision convince you? Don’t you think it might be more important how many hospitals are close to survey respondents? To test this, we want to calculate the number of hospitals (and/or hospital beds) per district in North Rhine-Westphalia.
Convert the joined object to a dplyr::as_tibble() data frame to use the functions dplyr::group_by() and dplyr::summarise(). dplyr::n() counts the hospitals in each group; use sum(beds) to get the bed total per district.
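The grouping step from the hint can be sketched on a small table before applying it to the real data. The AGS codes and bed numbers below are invented for illustration and do not come from hospital_points.csv.

```r
library(dplyr)

# Hypothetical stand-in for the joined hospital table (all values invented)
toy_hospitals <- dplyr::tibble(
  AGS  = c("05111", "05111", "05112", "05113", "05113", "05113"),
  beds = c(100, 250, 80, 40, 60, 300)
)

toy_summary <-
  toy_hospitals %>%
  # one group per district key
  dplyr::group_by(AGS) %>%
  dplyr::summarise(
    # number of rows (hospitals) in the group
    hospital_count = dplyr::n(),
    # total beds in the group
    hospital_bed_count = sum(beds)
  )

toy_summary
```

Each district key ends up in one row that holds its hospital count and its bed total.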
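# load the districts and keep those that intersect the NRW polygon
# (st_intersects also keeps districts that merely touch or
# slightly overlap the NRW border)
nrw_districts <-
  sf::read_sf("./data/VG250_KRS.shp") %>%
  sf::st_transform(3035) %>%
  sf::st_join(nrw, join = sf::st_intersects, left = FALSE)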
nrw_hospitals <-
  nrw_hospitals %>%
  # beds were character, now numeric
  dplyr::mutate(beds = as.numeric(beds)) %>%
  # replace NAs with zeros for simplification
  replace(is.na(.), 0)
district_hospital_join <-
  nrw_hospitals %>%
  # join the hospitals
  # within districts
  sf::st_join(nrw_districts, join = sf::st_within) %>%
  # use as tibble to perform
  # group by & summarise
  dplyr::as_tibble() %>%
  dplyr::group_by(AGS) %>%
  dplyr::summarise(
    hospital_count = dplyr::n(),
    hospital_bed_count = sum(as.numeric(beds))
  ) %>%
  # left join the new information
  # to the original data frame
  dplyr::left_join(nrw_districts, .) %>%
  # select only useful columns
  dplyr::select(AGS, hospital_count, hospital_bed_count)
## Joining with `by = join_by(AGS)`
summary(district_hospital_join)
##      AGS            hospital_count   hospital_bed_count          geometry
##  Length:73          Min.   : 2.000   Min.   : 704       MULTIPOLYGON :73
##  Class :character   1st Qu.: 4.000   1st Qu.:1403       epsg:3035    : 0
##  Mode  :character   Median : 6.000   Median :1789       +proj=laea...: 0
##                     Mean   : 6.491   Mean   :2236
##                     3rd Qu.: 8.000   3rd Qu.:2540
##                     Max.   :22.000   Max.   :7083
##                     NA's   :20       NA's   :20
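The 20 NA's in the summary are districts to which no hospital was joined. The remaining 53 districts have a mean of 6.491 hospitals, which is consistent with the 344 hospitals counted earlier (53 × 6.491 ≈ 344). One plausible reading, which the source does not confirm, is that the NA rows are districts of neighbouring states that were kept because they touch the NRW polygon. Whether such rows should be set to zero or dropped depends on that. The sketch below uses invented toy tables to show how the rows can be flagged before the missing counts are replaced.

```r
library(dplyr)

# Hypothetical district list and per-district counts (all values invented)
toy_districts <- dplyr::tibble(AGS = c("05111", "05112", "03251"))
toy_counts <- dplyr::tibble(
  AGS = c("05111", "05112"),
  hospital_count = c(2L, 1L)
)

toy_joined <-
  dplyr::left_join(toy_districts, toy_counts, by = "AGS") %>%
  dplyr::mutate(
    # flag districts without any joined hospital before overwriting the NA
    no_hospital_joined = is.na(hospital_count),
    # treat a missing count as zero hospitals
    hospital_count = dplyr::coalesce(hospital_count, 0L)
  )

toy_joined
```

Keeping the flag column makes it possible to inspect or filter the affected districts later, rather than silently mixing them with true zeros.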